---
title: Overview
---

## Basic Usage

ParadeDB supports aggregates through plain SQL functions like `COUNT`, `SUM`, and `AVG`. For instance, the following query tallies the total number of search results for a full text query.

```sql
SELECT COUNT(*) FROM mock_items
WHERE description @@@ 'shoes';
```

## Fast Aggregates

The performance of aggregates over very large result sets (millions of rows or more) is a well-known problem in Postgres.

In ParadeDB Enterprise, aggregates over full text search results are drastically accelerated by [fast fields](/documentation/indexing/fast_fields).
Fast fields are stored in a column-oriented fashion, which makes them much quicker to read and process.

The following steps are necessary to prepare the ParadeDB instance for fast fields.

### Configure Parallel Workers

ParadeDB Enterprise uses Postgres parallel workers. By default, Postgres allows two workers per parallel query.
Increasing the number of [parallel workers](/documentation/configuration/parallel) allows parallel queries to use all of the available hardware on the host machine and can deliver significant
speedups. For instance, if the host machine has 16 CPUs, `max_worker_processes`, `max_parallel_workers`, and `max_parallel_workers_per_gather`
should be set to 16.

<Note>
Remember to restart Postgres in order for `postgresql.conf` changes to take effect.
</Note>

### Run `VACUUM`

`VACUUM` updates the table's [visibility map](https://www.postgresql.org/docs/current/storage-vm.html),
which speeds up Postgres' visibility checks. Specifying `INDEX_CLEANUP false` allows `VACUUM` to run much
faster over large tables by skipping the cleanup of dead rows in the index.

<Note>
Make sure to pass the table name, not the index name, to `VACUUM`.
</Note>

```sql
VACUUM (INDEX_CLEANUP false) mock_items;
```

If the table experiences frequent updates, we recommend configuring [autovacuum](https://www.postgresql.org/docs/current/routine-vacuuming.html).

### Run `pg_prewarm`

The `pg_prewarm` extension can be used to preload data from the index into the Postgres buffer cache, which
improves the response times of "cold" queries (i.e. the first search query after Postgres has restarted).

```sql
CREATE EXTENSION pg_prewarm;
SELECT pg_prewarm('search_idx');
```

### Index Fast Fields

You can check which fields are indexed as fast by running `paradedb.schema`.

```sql
SELECT name, fast FROM paradedb.schema('search_idx') ORDER BY fast;
```

<Accordion title="Expected Response">
```csv
     name     | fast
--------------+------
 metadata     | f
 description  | f
 category     | f
 id           | t
 in_stock     | t
 rating       | t
 weight_range | t
 created_at   | t
 ctid         | t
(9 rows)
```
</Accordion>

In order for a query to be accelerated, **all** of the returned columns must be fast fields. For instance, the following query is accelerated because `rating` is a fast
field, and `COUNT(*)` does not specify any fields.

```sql
SELECT rating, COUNT(*) FROM mock_items
WHERE description @@@ 'shoes'
GROUP BY rating LIMIT 5;
```

You can verify if a query will be accelerated by running `EXPLAIN`. Accelerated queries will show either a `Custom Scan` or `Parallel Index Only Scan` in the query plan.

```sql
EXPLAIN SELECT rating, COUNT(*) FROM mock_items
WHERE description @@@ 'shoes'
GROUP BY rating LIMIT 5;
```

<Accordion title="Expected Response">
```csv
                                                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10.01..10.05 rows=2 width=12)
   ->  GroupAggregate  (cost=10.01..10.05 rows=2 width=12)
         Group Key: rating
         ->  Sort  (cost=10.01..10.02 rows=2 width=4)
               Sort Key: rating
               ->  Custom Scan (ParadeDB Scan) on mock_items  (cost=10.00..10.00 rows=2 width=4)
                     Table: mock_items
                     Index: search_idx
                     Fast Fields: rating
                     Exec Method: NumericFastFieldExecState
                     Scores: false
                     Tantivy Query: {"with_index":{"oid":3748196,"query":{"parse_with_field":{"field":"description","query_string":"shoes","lenient":null,"conjunction_mode":null}}}}
(12 rows)
```
</Accordion>

On the other hand, a query that selects a non-fast field like `description` is not accelerated.

```sql
SELECT description, COUNT(*) FROM mock_items
WHERE description @@@ 'shoes'
GROUP BY description LIMIT 5;
```

If we run `EXPLAIN` on this query, we'll see a regular `Index Scan`, which indicates that no fast field optimization occurred.

```sql
EXPLAIN SELECT description, COUNT(*) FROM mock_items
WHERE description @@@ 'shoes'
GROUP BY description LIMIT 5;
```

<Accordion title="Expected Response">
```csv
                                                                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=14.04..14.08 rows=2 width=40)
   ->  GroupAggregate  (cost=14.04..14.08 rows=2 width=40)
         Group Key: description
         ->  Sort  (cost=14.04..14.04 rows=2 width=32)
               Sort Key: description
               ->  Index Scan using search_idx on mock_items  (cost=10.00..14.03 rows=2 width=32)
                     Index Cond: (id @@@ '{"with_index":{"oid":3567455,"query":{"parse_with_field":{"field":"description","query_string":"shoes","lenient":null,"conjunction_mode":null}}}}'::paradedb.searchqueryinput COLLATE "default")
(7 rows)
```
</Accordion>

That's it! ParadeDB is now ready for fast aggregates.
